Natural number recognition using MCE trained inter-word context dependent acoustic models
Abstract
Among applications that require number recognition, the focus has largely been on connected digit recognizers. In this paper, we introduce an acoustic model topology for natural number recognition that uses minimum classification error (MCE) training of inter-word context-dependent models of the head-body-tail (HBT) type. Experimental results on natural number applications involving dollar amounts and U.S. telephone numbers show that using HBT models for natural number data reduces string error rates by as much as 25% over context-independent whole-word models. In addition, for speech input that consists strictly of connected digits, the increase in string error rate is negligible when a natural number telephone grammar is used instead of a connected digit telephone grammar. This should help natural number speech recognition systems gain wider acceptance, because recognition accuracy is maintained while the user interface becomes more natural and flexible.
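The abstract names MCE training but does not give the objective. As a rough illustration only, the sketch below follows the commonly cited textbook formulation of the MCE loss (a misclassification measure passed through a sigmoid); whether the paper uses exactly this form is an assumption. The names eta and gamma and all scores are hypothetical and chosen only for the example.

```python
import math


def misclassification_measure(correct_score, competitor_scores, eta=1.0):
    """Soft-max average of competitor scores minus the correct-class score.

    Scores are discriminant values such as log-likelihoods (assumed here).
    A positive result means the competitors outscore the correct string.
    """
    if not competitor_scores:
        raise ValueError("at least one competitor score is required")
    # Log-sum-exp with the maximum factored out for numerical stability.
    scaled = [eta * s for s in competitor_scores]
    peak = max(scaled)
    log_sum = peak + math.log(sum(math.exp(s - peak) for s in scaled))
    anti_score = (log_sum - math.log(len(competitor_scores))) / eta
    return anti_score - correct_score


def mce_loss(correct_score, competitor_scores, eta=1.0, gamma=1.0):
    """Sigmoid-smoothed error count in the range (0, 1)."""
    z = gamma * misclassification_measure(correct_score, competitor_scores, eta)
    # Two-branch sigmoid avoids overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    return math.exp(z) / (1.0 + math.exp(z))


if __name__ == "__main__":
    # Hypothetical log-likelihoods: correct string vs. two competing strings.
    print(mce_loss(-5.0, [-20.0, -25.0]))   # correct string clearly wins
    print(mce_loss(-20.0, [-5.0, -25.0]))   # a competitor clearly wins
```

The loss approaches 0 when the correct string scores well above its competitors and approaches 1 when a competitor dominates, so minimizing it over training data approximates minimizing the string error count directly.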
Similar resources
Natural number recognition using discriminatively trained inter-word context dependent hidden Markov models
Many automatic speech recognition telephony applications involve recognition of input containing some type of numbers. Traditionally, this has been achieved by using isolated or connected digit recognizers. However, as speech recognition finds a wider range of applications, it is often infeasible to impose restrictions on speaker behavior. This paper studies two model topologies for natural num...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Novel filler acoustic models for connected digit recognition
The context-dependent modeling technique is extended to include non-speech filler segments occurring between speech word units. In addition to the conventional context-dependent word or subword units, the proposed acoustic modeling provides an efficient way of accounting for the effects of the surrounding speech on the inter-word non-speech segments, especially for small vocabulary recognition task...
Improved Acoustic Modeling for Continuous Speech Recognition
We report on some recent improvements to an HMM-based, continuous speech recognition system which is being developed at AT&T Bell Laboratories. These advances, which include the incorporation of inter-word, context-dependent units and an improved feature analysis, lead to a recognition system which achieves better than 95% word accuracy for speaker independent recognition of the 1000-word, DARPA...
Towards age-independent acoustic modeling
In automatic speech recognition applications, due to significant differences in voice characteristics, adults and children are usually treated as two population groups, for which different acoustic models are trained. In this paper, age-independent acoustic modeling is investigated in the context of large vocabulary speech recognition. Exploiting a small amount (9 hours) of children’s speech an...